ABSTRACT

Building on recent interest in distributed logic programming, we
take a model-theoretic approach to analyzing confluence of
asynchronous distributed programs. We begin with a model-theoretic
semantics for Dedalus and develop the concept of ultimate models
to capture the non-deterministic eventual outcomes of distributed
programs. After demonstrating the undecidability of checking
confluence for Dedalus programs, we look for restricted sub-languages
that guarantee confluence while providing adequate expressivity.
We observe that a simple semipositive restriction called Dedalus+
guarantees confluence while capturing PTIME, but demonstrate that
the limited use of negation in Dedalus+ makes certain simple and
practical programs very difficult to express. To remedy this, we
introduce Dedalus^S, a restriction of Dedalus that allows a natural
use of negation in the spirit of stratified negation, but retains the
confluence of Dedalus+ and similarly captures PTIME.

1.  INTRODUCTION 

In  recent  years  there  has  been  optimism  that  declarative  languages 
grounded  in  Datalog  can  provide  a  clean  foundation  for  distributed 
programming [24]. This has led to activity in language and system
design  (e.g.,  [4,  9,  14,  32]),  as  well  as  formal  models  for  distributed 
computation  using  such  languages  (e.g.,  [8,  36,  37]). 

The bulk of this work has presented or assumed a formal
operational semantics based on transition systems and traces of input
events. A model-theoretic semantics for these languages has been
notably absent. In a related paper [3], we have developed a
model-theoretic semantics for Dedalus, a distributed logic language
based on Datalog, in which the “meaning” of a program is precisely the
set of stable models [38] that can arise via all possible temporal
permutations of messages. In the same paper, we demonstrate an
equivalence of these models with all possible executions in an
operational semantics akin to those in the prior literature.

In this paper we take advantage of the availability of declarative
semantics to explore the correctness of distributed programs.
Specifically, we address the desire to ensure deterministic program
outcomes (confluence) in the face of inherently non-deterministic
timings of computation and messaging. This is a matter of widespread
practical concern in distributed systems, often cast in terms of
“eventual consistency” [40, 39], and grounded in foundational issues of
time and clocks in the theory of distributed computing [27].

Using our model-theoretic semantics for Dedalus, we can reason
about the set of possible outcomes of a distributed program,
based on what we define as their ultimate models. This formal
framework enables us to declaratively describe the potential for
non-deterministic outcomes in Dedalus programs. It also allows
us to identify restricted sub-languages of Dedalus that ensure a
model-theoretic notion of confluence: the existence of a unique
ultimate model for any program expressible in that sub-language.
The next question then is to identify a sub-language that ensures
confluence without unduly constraining expressivity, in terms of
both computational power and the ability to employ familiar
programming constructs.

One natural step in this direction is to drop back from the
expressive power of Dedalus to a monotonic subset: a language we
call Dedalus+ that disallows negation of IDB predicates. This is
inspired in part by the CALM theorem [5, 24], which established
a connection between confluence and monotonicity; subsequent
formalizations proved that monotonic distributed programs are
indeed guaranteed to be confluent [1, 8]. In terms of expressivity,
Immerman's celebrated result regarding the collapse of the
fixpoint hierarchy established that PTIME is captured by a similar
monotonic language: semipositive Datalog (Datalog¬ where negation
is restricted to EDB relations) augmented with an ordering
over the universe [25]. Put together, these results lead to a rather
startling conclusion: Dedalus+ shows that it is possible to express
any polynomial-time distributed algorithm (surely the vast majority
of useful distributed code!) in an eventually consistent manner.

This result is intriguing but not necessarily useful. In particular,
it does not guarantee that Dedalus+ or similar monotonic languages
can be used to express natural declarations of programs. Perhaps
this is why, despite Immerman’s complexity results over 25 years
ago, there has been ongoing interest in the topic of negation in logic
programs. More specifically, we have found that Dedalus+ is quite
unnatural to use in many cases that interest us: we demonstrate
this below via a practical system component (distributed garbage
collection) that is easily written in Dedalus, but would be quite
convoluted in Dedalus+.

Given this background, we seek a more comfortable balance
between expressive power, ease of programming and guarantees of
confluence. We achieve this via a controlled use of negation that
draws inspiration from both stratified negation in logic, and
coordination protocols from distributed computing. We present a language
called Dedalus^S whose semantics allow negated predicates, but
subject to a closed-world assumption that these predicates are evaluated
on their “complete”, unchanging state. To make this practical in
a distributed context, we then show that an operational semantics
for Dedalus^S can be achieved by compiling Dedalus^S programs
into stylized Dedalus programs that augment the original code with
“coordination” rules to establish completion of subgoals as needed.
The operational semantics of the resulting Dedalus programs are
then given by the prior literature. The result is a sub-language that
retains many of the features we desire: PTIME expressivity,
guarantees of confluence, and an intuitive and familiar use of negation.
We believe the result is practically useful: indeed, Dedalus^S
corresponds closely to Bloom, a practical programming language we have
used to implement a broad array of distributed systems [10].

Our technical contributions in this paper include the following:
(a) the definition of ultimate models as a declarative framework for
assessing outcomes of Dedalus programs and the undecidability of
confluence for Dedalus programs (Section 3), (b) the introduction
of Dedalus+, its expressive power, and proof that it can only express
programs with unique ultimate models (Section 3.2), (c) the
introduction of Dedalus^S, examples of its use, and model-theoretic proof
of its confluence (Sections 4 and 4.1), and (d) an operational semantics
for Dedalus^S achieved via a compilation of Dedalus^S programs into
judiciously “coordinated” Dedalus programs (Sections 4.2 and 4.3).

2.  Dedalus 

Dedalus extends Datalog by adding spatial and temporal
attributes to every relation. The critical semantic issue from
distributed computing that we wish to capture is non-determinism of
message timing across nodes. This non-determinism is modeled
simply by the use of a restricted version of Sacca and Zaniolo’s
choice construct [38]. We use the stable model semantics [21] to
assign meanings to the behaviors of Dedalus programs over time.
In a companion paper [3], we prove these stable models are
equivalent to traces in a variation of the network transducer model (an
operational formalism for distributed systems) and thus argue that
Dedalus is a reasonable model for distributed systems.

We  believe  that  the  stable  model  semantics  are  inappropriate 
to  represent  the  output  of  a  distributed  system,  as  they  contain  a 
potentially  infinite  number  of  distinctions  that  are  not  meaningful 
in  an  “eventual”  sense.  Thus,  we  introduce  the  ultimate  model 
semantics  as  a  representation  of  program  output.  As  one  might 
imagine,  even  in  the  ultimate  model  semantics,  some  programs 
may  have  multiple  ultimate  models  that  correspond  to  meaningful 
distinctions  between  stable  models. 

We  begin  this  section  by  reviewing  the  syntax  of  Dedalus  first 
presented  in  Alvaro  et  al.  [6].  We  then  review  the  model-theoretic 
semantics  for  Dedalus  [3],  using  a  similar  exposition  for  simplicity. 

2.1  Syntax 

2.1.1  Preliminary  Definitions 

We assume an infinite universe U of values. We assume N ⊂ U.¹

A relation schema is a pair (R, k) where R is a relation name and
k its arity. We also write (R, k) as R^(k). A database schema S is a
set of relation schemas. Any relation name occurs at most once in a
database schema.

As in Immerman [25], we assume the existence of an order: every
database schema contains the relation schema <^(2).² Later, we will
explain how < is populated.

¹ N = {0, 1, 2, ...}

² We will often write < in infix notation.


A fact over a relation schema R^(n) is a pair consisting of the
relation name R and an n-tuple (c_1, ..., c_n), where each c_i ∈ U. We
denote a fact over R^(n) by R(c_1, ..., c_n).

A relation instance for relation schema R^(n) is a set of facts whose
relation is R. A database instance maps each relation schema R^(n)
to a corresponding relation instance for R^(n).

A rule φ consists of several distinct components: a head atom
head(φ), a set pos(φ) of atoms, a set neg(φ) of atoms, a set of
inequalities ineq(φ) of the form x < y, and a set of choice operators
cho(φ) applied to the variables. Intuitively, we use choice operators
to model real-world non-determinism due to network asynchrony.
The elements of pos(φ) are called positive body atoms, and the
elements of neg(φ) are called negative body atoms.

The conventional syntax for a rule is:

head(φ) ← f_1, ..., f_n, ¬g_1, ..., ¬g_m, ineq(φ), cho(φ).

where f_i ∈ pos(φ) for i = 1, ..., n and g_i ∈ neg(φ) for i =
1, ..., m. The following is an example of a rule over schema S with
ineq(φ) = ∅ and cho(φ) = ∅.

p(W) ← b_1(X_1), ..., b_l(X_l), ¬c_1(Y_1), ..., ¬c_m(Y_m).

where p, b_1, ..., b_l, c_1, ..., c_m are relations in S, and W, X_i, and Y_j
denote a tuple (of the appropriate arity) consisting of constants from
U or variable symbols.

The relation name < may only appear in ineq; in particular, < may
not appear in any atom in head, pos, or neg.

2.1.2  Safety 

Dedalus  maintains  the  usual  Datalog  safety  restrictions:  every 
variable  symbol  V  in  a  rule  must  appear  in  some  atom  in  pos. 

For  a  variable  symbol  V  that  appears  exactly  once  in  exactly  one 
neg  atom,  and  does  not  appear  elsewhere  in  the  rule,  there  is  a 
straightforward rewrite defined in Ullman [41] that brings the rule
into  compliance  with  the  safety  restriction.  An  example  of  the 
rewrite  appears  below. 

Example 1. The unsafe rule:

p(X) ← q(X), ¬r(X,Y).

is rewritten into the following two rules that obey the safety
constraint:

p(X) ← q(X), ¬r'(X).

r'(X) ← r(X,Y).

where r'^(1) is a relation schema that does not otherwise appear in
the program.
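
The rewrite of Example 1 is mechanical, and a small sketch may help make it concrete. The Python fragment below is illustrative only and is not part of Dedalus: the Atom and Rule types, the make_safe function, the convention that variables are capitalized strings, and the primed name given to the helper relation are all assumptions introduced here.

```python
from collections import Counter
from typing import NamedTuple


class Atom(NamedTuple):
    name: str
    args: tuple  # variables are capitalized strings (assumed convention)


class Rule(NamedTuple):
    head: Atom
    pos: tuple  # positive body atoms
    neg: tuple  # negated body atoms


def is_var(term: str) -> bool:
    return term[:1].isupper()


def make_safe(rule: Rule) -> list:
    """Project away variables that occur exactly once, inside a negated atom."""
    counts = Counter(
        arg for atom in (rule.head, *rule.pos, *rule.neg) for arg in atom.args
    )
    new_neg, helpers = [], []
    for atom in rule.neg:
        kept = tuple(a for a in atom.args if not (is_var(a) and counts[a] == 1))
        if kept == atom.args:
            new_neg.append(atom)  # nothing to project away
            continue
        fresh = Atom(atom.name + "'", kept)  # hypothetical fresh relation name
        helpers.append(Rule(fresh, (atom,), ()))
        new_neg.append(fresh)
    return [Rule(rule.head, rule.pos, tuple(new_neg)), *helpers]


unsafe = Rule(Atom("p", ("X",)), (Atom("q", ("X",)),), (Atom("r", ("X", "Y")),))
safe_rules = make_safe(unsafe)
```

Applied to the unsafe rule of Example 1, the sketch returns the two rules shown there. A real implementation would also have to check that the primed name is unused elsewhere in the program, which this sketch does not do.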

For  readability,  we  will  use  the  underscore  symbol  (_)  to  represent 
a  variable  that  appears  only  once  in  a  rule. 

2.1.3  Spatial  and  Temporal  Extensions 

Given a database schema S, we use S+ to denote the schema
obtained as follows. For each relation schema r^(n) ∈ (S \ {<}),
we include a relation schema r^(n+1) in S+. The additional column
added to each relation schema is called the location specifier. By
convention, the location specifier is the first column of the relation.
S+ also includes <^(2), and a relation schema node^(1): the finite set
of node identifiers that represents all of the nodes in the distributed
system. We call S+ a spatial schema.³

A spatial fact over a relation schema R^(n) is a pair consisting of
the relation name R and an (n + 1)-tuple (d, c_1, ..., c_n) where each
c_i ∈ U, d ∈ U, and node(d). We denote a spatial fact over R^(n)
by R(d, c_1, ..., c_n). A spatial relation instance for a relation
schema r^(n) is a set of spatial facts for r^(n+1). A spatial database
instance is defined similarly to a database instance.

³ The terms spatial and spatio-temporal are borrowed from
Ameloot [7].

Given a database schema S, we use S* to denote the schema
obtained as follows. For each relation schema r^(n) ∈ (S \ {<}) we
include a relation schema r^(n+2) in S*. The first additional column
added is the location specifier, and the second is the timestamp. By
convention, the location specifier is the first column of every relation
in S*, and the timestamp is the second. S* also includes <^(2) (finite),
node^(1) (finite), time^(1) (infinite) and timeSucc^(2) (infinite). We
call S* a spatio-temporal schema.

A spatio-temporal fact over a relation schema R^(n) is a pair
consisting of the relation name R and an (n + 2)-tuple (d, t, c_1, ..., c_n)
where each c_i ∈ U, d ∈ U, t ∈ U, node(d), and time(t). We
denote a spatio-temporal fact over R^(n) by R(d, t, c_1, ..., c_n).

A spatio-temporal relation instance for relation schema r^(n) is a
set of spatio-temporal facts for r^(n+2). A spatio-temporal database
instance is defined similarly to a database instance; in any
spatio-temporal database instance, time^(1) is mapped to the set containing
a time(t) fact for all t ∈ N, and timeSucc^(2) to the set containing
a timeSucc(x, y) fact for all y = x + 1 (x, y ∈ N).

We will use the notation f@t to mean the spatio-temporal fact
obtained from the spatial fact f by adding a timestamp column with
the constant t.

A spatio-temporal rule over a spatio-temporal schema S* is a rule
of one of the following three forms:

1. A deductive rule φ:

p(L, T, W) ← b_1(L, T, X_1), ..., b_l(L, T, X_l),
    ¬c_1(L, T, Y_1), ..., ¬c_m(L, T, Y_m), node(L),
    time(T), ineq(φ).

2. An inductive rule φ:

p(L, S, W) ← b_1(L, T, X_1), ..., b_l(L, T, X_l),
    ¬c_1(L, T, Y_1), ..., ¬c_m(L, T, Y_m), node(L),
    time(T), timeSucc(T, S), ineq(φ).

3. An asynchronous rule φ:

p(D, S, W) ← b_1(L, T, X_1), ..., b_l(L, T, X_l),
    ¬c_1(L, T, Y_1), ..., ¬c_m(L, T, Y_m), node(L),
    time(T), time(S), choice((L, T, B), (S)),
    node(D), ineq(φ).

The last two kinds of rules are collectively called temporal rules.

In the rules above, B is a tuple that contains all of the distinct
variable symbols in X_1, ..., X_l, Y_1, ..., Y_m. The variable symbols
D and L may appear in any of W, X_1, ..., X_l, Y_1, ..., Y_m,
whereas T and S may not. Head relation name p may not be time,
timeSucc, or node. Relations b_1, ..., b_l, c_1, ..., c_m may
not be timeSucc, time, or <.

The use of a single location specifier and timestamp in rule
bodies intuitively corresponds to considering deductions that can be
evaluated at a single node at a single point in time. Inductive rules
use the timeSucc relation to carry the results of deductions into the
next visible timestep.

Note that asynchronous rules are the only kinds of rules with
cho ≠ ∅. The choice construct is from Sacca and Zaniolo [38].
The choice((X), (Y)) construct represents the constraint that
the variables in Y be functionally dependent on the variables in X.
Due to variable binding restrictions, only asynchronous rules may
have a different value for the head location specifier than the body
location specifier. Intuitively, different values for the head and body
location specifiers represent cross-node communication; a binding
of L, T, and B represents a message being sent from location L to
location D. To model the fact that the network may arbitrarily delay,
re-order, and batch messages, any single value of head timestamp S
is permissible.

We use the causality rewrite of Alvaro et al. [3], which introduces
the following causality constraint: a message sent by a node x at
local timestamp s cannot cause another message to arrive in the past
of node x (i.e., at a time before local timestamp s).⁴ Intuitively, the
causality constraint rules out models corresponding to impossible
executions, in which effects are perceived before their causes. Full
details are available in a companion paper [3].

A Dedalus program is a finite set of causally rewritten
spatio-temporal rules over some spatio-temporal schema S*.

2.1.4  Syntactic  Sugar 

The restrictions on timestamps and location specifiers suggest a
natural syntactic sugar to improve readability. We annotate inductive
head relations with @next and asynchronous head relations with
@async; deductive rules have no head annotation. These annotations
allow us to omit the boilerplate usage of node, time, timeSucc
and choice in rule bodies, as well as the timestamp attributes from
rule heads and bodies. We also omit location specifiers by default,
but refer to them if necessary, as described later. Using this syntactic
sugar, below are examples of the three kinds of rules listed above.

Example 2. Example deductive, inductive, and asynchronous
rules.

1. Deductive:

p(W) ← b_1(X_1), ..., b_l(X_l), ¬c_1(Y_1), ...,
    ¬c_m(Y_m).

2. Inductive:

p(W)@next ← b_1(X_1), ..., b_l(X_l), ¬c_1(Y_1), ...,
    ¬c_m(Y_m).

3. Asynchronous:

p(W)@async ← b_1(X_1), ..., b_l(X_l), ¬c_1(Y_1), ...,
    ¬c_m(Y_m).

In  any  kind  of  rule,  the  body  location  specifier  can  be  accessed  by 
including  a  variable  symbol  or  constant  prefixed  with  #  as  any  body 
atom’s  first  argument.  In  asynchronous  rules  only,  the  head  location 
specifier  can  be  accessed  by  including  a  variable  symbol  or  constant 
prefixed  with  a  #  as  the  head  atom’s  first  argument.  The  example 
below  shows  an  example  of  #  in  an  asynchronous  rule. 

Example  3.  The  head  and  body  location  specifiers  are  bound  to 
D  and  L  respectively.  Note  how  D  may  appear  in  the  body,  L  may 
appear  in  the  head,  and  L  may  appear  duplicated  in  the  body. 

p(#D, L, W)@async ← b(#L, D, W), ¬c(#L, L).

2.2  Semantics 

We  restrict  our  attention  to  Dedalus  programs  whose  deductive 
rules  are  syntactically  stratified. 

⁴ Note that in other presentations of Dedalus (e.g., [6]), message
timestamps are chosen from N ∪ {⊤}, where ⊤ represents a special
value indicating that the message was dropped by the network. In
this paper, we assume reliable delivery of messages.


An input schema S' for a Dedalus program P with spatio-temporal
schema S* is a subset of P's spatial schema S+. Every input schema
contains the node relation; we will not explicitly mention the
presence of node when detailing an input schema. A relation in S' is
called an EDB relation. All other relations are called IDB.

An EDB instance E is a spatial database instance that maps each
EDB relation r to a finite spatial relation instance for r. The active
domain of an EDB instance E for a program P is the set of constants
appearing in E and P. Every EDB instance maps the < relation to a
total order over its active domain.

We can view an EDB instance as a spatio-temporal database
instance K. For every r(d, c_1, ..., c_n) ∈ E, the fact
r(d, t, c_1, ..., c_n) ∈ K for all t ∈ N. Intuitively, EDB facts
“exist at all timesteps.”
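
As an illustration of this lifting, the following Python sketch copies each spatial EDB fact to every timestamp below a bound. It is an assumption-laden approximation: the true lifting ranges over all of N, so the horizon parameter, the tuple encoding of facts as (name, location, args...), and the name lift_edb are hypothetical choices made here. The node relation is left without a timestamp, following the spatio-temporal schema above.

```python
def lift_edb(edb: set, horizon: int) -> set:
    """Copy each spatial fact (name, loc, *args) to timestamps 0..horizon-1."""
    lifted = set()
    for name, *rest in edb:
        if name == "node":  # node^(1) carries no timestamp column
            lifted.add((name, *rest))
            continue
        loc, *args = rest
        for t in range(horizon):
            # The timestamp becomes the second column, after the location.
            lifted.add((name, loc, t, *args))
    return lifted


edb_instance = {("node", "n1"), ("q", "n1")}
prefix = lift_edb(edb_instance, 3)
```

Here prefix holds the node fact plus q at timestamps 0, 1, and 2, which is a finite prefix of the infinite instance described in the text.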

We  refer  to  a  Dedalus  program  together  with  an  EDB  instance 
as  a  Dedalus  instance.  The  semantics  of  a  Dedalus  program  can 
be  viewed  as  a  mapping  from  EDB  instances  to  spatio-temporal 
database  instances. 

Recall that choice is only used in asynchronous rules, to model
the fact that the network may arbitrarily delay, re-order, and batch
messages. Sacca and Zaniolo [38] propose the stable model
semantics as a natural interpretation of choice, and we provide the
model-theoretic details elsewhere [3]. Intuitively, each stable model
is a spatio-temporal database instance that defines a possible
function for choice that obeys the causality rewrite; every possible
function that obeys the causality rewrite defines a stable model. In
other words, each different causal choice of timesteps for a Dedalus
instance corresponds to a different stable model of that instance.

Example 4. Take the following Dedalus program with input
schema {q}. Assume the EDB instance is {node(n1), q(n1)}.

p(#L)@async ← q(#L).

Let the power set of X be denoted P(X). For each S ∈ P(N \ {0}),
where |S| = |N|, the following is a stable model:

{node(n1)} ∪ {p(n1, i) | i ∈ S} ∪ {q(n1, i) | i ∈ N}

These are the only stable models of the instance. Since q is part
of the input schema, it is true at every time. Every timestep involves
a separate choice of arrival time for p, which must be later than time 0.
Elements S of the power set with finite cardinality are ruled out, due
to the causality constraint [3].

2.2.1  Ultimate  Models 

The stable model semantics is a suitable model-theoretic
characterization of the behavior of a program in that there is a
correspondence between stable models and traces in an operational formalism
based on network transducers [3]. However, stable models
highlight uninteresting temporal differences that may not be “eventually”
significant, such as in the following example:

Example 5. Take the following Dedalus program with input
schema {q}. The program determines whether two values, c1
and c2, “arrive” at the same time. Assume the EDB instance is
{node(n1), q(n1,c1), q(n1,c2)}.

p(#L,X)@async ← q(#L,X), ¬r(#L,X).

r(X)@next ← q(X).

r(X)@next ← r(X).

concurrent() ← p(n1,c1), p(n1,c2).

concurrent()@next ← concurrent().

For each s, t ∈ N, the following is a stable model:

{q(n1,i,c1), q(n1,i,c2) | i ∈ N} ∪
{node(n1), p(n1,s,c1), p(n1,t,c2)} ∪
{r(n1,i,c1), r(n1,i,c2) | i ∈ N \ {0}} ∪
{concurrent(n1,i) | i ∈ N ∧ s ≤ i} if s = t

These are the only stable models of the instance. Since q is part of
the input schema, q facts are true at every time. By the rules, r facts
are true at every time except time 0. Thus, there is only one choice
of head timestamp for p for each value of q’s second argument: this
choice corresponds with a body timestamp of 0. If these choices are
the same, then concurrent() is true at all timestamps afterwards.

However, note that while the specific values of s and t are
unimportant in terms of the eventual contents of the concurrent relation,
there are different stable models for each of these choices. Intuitively,
we do not want these “intermediate” temporal behaviors, which are not
eventually significant, to differentiate program outputs.

In  order  to  rule  out  such  behaviors  from  the  output,  we  will  define 
the  concept  of  an  ultimate  model  to  represent  a  program's  “output.” 

An output schema for a Dedalus program P with spatio-temporal
schema S* is a subset of P’s spatial schema S+. We denote the
output schema as S°.

Recall that a stable model defines a spatio-temporal database
instance, which is a mapping from every relation r in S* to a
spatio-temporal relation instance for r, which itself is a set of
spatio-temporal facts for r. We define the eventually always true
function ◇□, which maps a spatio-temporal database instance T to
a spatial database instance ◇□T. For every spatio-temporal fact
r(p, t, c_1, ..., c_n) ∈ T, the spatial fact r(p, c_1, ..., c_n) ∈ ◇□T
if relation r is in S° and ∀s. (s ∈ N ∧ t < s) ⇒
(r(p, s, c_1, ..., c_n) ∈ T).

The set of ultimate models of a Dedalus instance I is {◇□(T) | T
is a stable model of I}. Intuitively, an ultimate model contains
exactly the facts in relations in the output schema that are eventually
always true in a stable model.

Note that an ultimate model is always finite because of the
finiteness of the EDB, the safety conditions on rules, the restrictions
on the use of timeSucc and time, and the prohibition on binding
timestamps to non-timestamp attributes. A Dedalus program only
has a finite number of ultimate models for the same reason.

Example 6. For Example 4 with S° = {p}, there are two ultimate
models: {} and {p(n1)}. The latter corresponds to an element S of
the power set such that ∃x. ∀y. (y > x) ⇒ (y ∈ S). The former
corresponds to an element S that does not have this property.

For Example 5 with S° = {concurrent()}, there are two ultimate
models: {} and {concurrent()}. The former corresponds to
choices of timestamp for c1 and c2 that are not equal, whereas the
latter corresponds to equal choices of timestamp.
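
The eventually always true function can be sketched in code for stable models that admit a finite description. The Python fragment below is a minimal sketch under explicit assumptions, not the authors' construction: it encodes the timestamps of each spatial fact as a finite set plus an optional tail ("true from this timestamp onward"), an encoding (Times, eventually_always, example5_model) invented here for illustration. This encoding covers the stable models of Example 5 but cannot describe arbitrary infinite timestamp sets, such as the non-cofinite choices of S in Example 4.

```python
from typing import NamedTuple, Optional


class Times(NamedTuple):
    finite: frozenset          # isolated timestamps at which the fact holds
    tail_from: Optional[int]   # fact holds at every timestamp >= tail_from


def eventually_always(instance: dict, output_schema: set) -> set:
    """Keep output-schema facts that hold at all sufficiently late timestamps."""
    return {
        fact
        for fact, times in instance.items()
        if fact[0] in output_schema and times.tail_from is not None
    }


def example5_model(s: int, t: int) -> dict:
    """Stable model of Example 5 for arrival timestamps s (for c1) and t (for c2)."""
    model = {
        ("q", "n1", "c1"): Times(frozenset(), 0),
        ("q", "n1", "c2"): Times(frozenset(), 0),
        ("r", "n1", "c1"): Times(frozenset(), 1),
        ("r", "n1", "c2"): Times(frozenset(), 1),
        ("p", "n1", "c1"): Times(frozenset({s}), None),
        ("p", "n1", "c2"): Times(frozenset({t}), None),
    }
    if s == t:
        model[("concurrent", "n1")] = Times(frozenset(), s)
    return model


equal_arrival = eventually_always(example5_model(3, 3), {"concurrent"})
unequal_arrival = eventually_always(example5_model(2, 5), {"concurrent"})
```

With equal arrival timestamps the result contains the concurrent fact, and with unequal timestamps it is empty, matching the two ultimate models of Example 6. The sketch keeps the location n1 in the fact tuple, whereas Example 6 writes the fact with the location specifier omitted.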

3.  REFINING  Dedalus 

Dedalus  can  express  a  broad  class  of  distributed  systems  but  this 
flexibility  comes  at  a  cost.  As  we  have  shown,  a  Dedalus  program 
may  have  multiple  ultimate  models.  However,  it  is  often  desirable  to 
ensure  that  a  program  has  a  single,  deterministic  output,  regardless 
of  non-determinism  in  its  behavior. 

Having  defined  the  Dedalus  language,  we  will  refer  to  two  run¬ 
ning  examples  for  the  remainder  of  the  paper. 

Example  7.  A  simple  asynchronous  marriage  ceremony: 


groom_i_do()@async ← groom_i_do_edb().
bride_i_do()@async ← bride_i_do_edb().

runaway() ← ¬bride_i_do(), groom_i_do().

runaway() ← ¬groom_i_do(), bride_i_do().

runaway()@next ← runaway().
groom_i_do()@next ← groom_i_do().
bride_i_do()@next ← bride_i_do().

In a classic paper, Gray notes the similarity between distributed
commit protocols and marriage ceremonies [22]. For simplicity
(and felicity), Example 7 presents a simple asynchronous voting
program with a fixed set of members: a bride and a groom. The
marriage is off (runaway() is true) if one party says “I do” and the
other does not.

However, the Dedalus program as given does not correctly
implement such a vote. Any stable model where groom_i_do() and
bride_i_do() disagree in their first chosen timestamps yields an
ultimate model containing runaway(). By contrast, if the votes are
assigned the same timestamp, the ultimate model does not contain
runaway(). In operational terms, this program exhibits a race
condition: when the EDB contains “I do” votes from both parties, the
truth value of runaway() depends on the (non-deterministic) times
at which their messages are delivered.
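
The race condition can be replayed with a small simulation. The Python sketch below is an informal operational reading of Example 7, not the model-theoretic semantics: it assumes both EDB votes are present, takes the first arrival timestamp of each vote as input, and steps through timestamps applying the persistence and runaway() rules. The function name and the bounded horizon are assumptions made here.

```python
def runaway_in_ultimate_model(groom_first: int, bride_first: int) -> bool:
    """Replay Example 7 for given first-arrival timestamps of the two votes."""
    # Two steps past the later arrival are enough: nothing changes afterwards.
    horizon = max(groom_first, bride_first) + 2
    runaway = False
    for now in range(horizon):
        groom = now >= groom_first  # persisted by the @next rule after arrival
        bride = now >= bride_first
        if groom != bride:          # either deductive runaway() rule fires
            runaway = True          # runaway()@next keeps it true from then on
    return runaway


# Enumerate a few delivery schedules and collect the distinct outcomes.
outcomes = {
    runaway_in_ultimate_model(g, b) for g in range(1, 4) for b in range(1, 4)
}
```

The set outcomes contains both True and False: the same EDB yields two different ultimate models depending only on message timing.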

Example  8.  Distributed  garbage  collection: 
addr(Addr)@async ← addr_edb(Addr).
refers_to(#M, Src, Dst)@async ←
    local_ptr_edb(#N, Src, Dst), master(#M).
refers_to(Src, Dst)@next ← refers_to(Src, Dst).
reach(Src, Dst) ← refers_to(Src, Dst).
reach(Src, Next) ← reach(Src, Dst),
    refers_to(Dst, Next).

garbage(Addr) ← addr(Addr), root_edb(Root),
    ¬reach(Root, Addr).
garbage(Addr)@next ← garbage(Addr).

Example 8 presents a simple garbage collection program for a
distributed memory system. Each node manages a set of pointers and
forwards this information to a central master node. The master
computes the set of transitively reachable addresses; if an address is
not reachable from the root address, it can be garbage collected. For
simplicity, we assume that each node owns a fixed set of pointers,
stored in the EDB relation local_ptr_edb.

This more complicated example suffers from the same ambiguity
as the marriage ceremony presented previously. While the
program has an ultimate model corresponding to executions in which
garbage is not computed until the transitive closure of refers_to
has been fully determined (i.e., after all messages have been
delivered), it also has ultimate models corresponding to executions
in which garbage is “prematurely” computed. When garbage is
computed before all the refers_to messages have been delivered,
there is a correctness violation: reachable memory addresses appear
in the garbage relation.
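
The premature computation of garbage can likewise be replayed operationally. The Python sketch below is a simplified reading of Example 8 at the master node only: it assumes a single root, takes the first arrival timestamp of each addr and refers_to message as input, and re-evaluates reach and garbage at every timestamp up to a bound. The names run_master and transitive_closure, the dictionary encoding of arrival times, and the sample addresses are all hypothetical.

```python
def transitive_closure(edges: set) -> set:
    """Naive fixpoint of the two reach rules over the edges received so far."""
    reach = set(edges)
    changed = True
    while changed:
        changed = False
        for (src, dst) in list(reach):
            for (mid, nxt) in edges:
                if dst == mid and (src, nxt) not in reach:
                    reach.add((src, nxt))
                    changed = True
    return reach


def run_master(root: str, addr_arrival: dict, edge_arrival: dict, horizon: int) -> set:
    """Accumulate garbage over time; garbage persists once derived."""
    garbage = set()
    for now in range(horizon):
        addrs = {a for a, t in addr_arrival.items() if t <= now}
        edges = {e for e, t in edge_arrival.items() if t <= now}
        reach = transitive_closure(edges)
        garbage |= {a for a in addrs if (root, a) not in reach}
    return garbage


addrs = {"a": 1, "b": 1, "c": 1}
all_on_time = {("r", "a"): 1, ("a", "b"): 1}
one_delayed = {("r", "a"): 1, ("a", "b"): 3}
preferred = run_master("r", addrs, all_on_time, 10)
premature = run_master("r", addrs, one_delayed, 10)
```

When every pointer arrives together with the addresses, only the unreachable address c is collected. When the pointer from a to b is delayed, the reachable address b is also marked as garbage and stays marked, which is the correctness violation described above.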

Note that for both examples, there is a single ultimate model
corresponding to the execution in which negation is not applied to
a set until the content of the set has been fully determined. This
“preferred” model is akin to the perfect model computed by a
centralized Datalog evaluator that evaluates rules in stratum order [41],
applying the closed-world assumption to relations only when it is
certain that they will no longer change. Unfortunately, in an
asynchronous distributed system it is difficult to distinguish the absence
of a message (e.g., the bride_i_do or some expected refers_to
messages) from channel delay. Hence both programs above are
underspecified insofar as they conclude, as soon as they receive
any messages, that no others will arrive. In practice, a programmer
could remediate the problem by augmenting their programs with
coordination code that enforces a computation barrier. This
technique generally entails a protocol (e.g., voting or consensus) that
takes place between all communicating agents to ensure that there
are no outstanding messages in flight.

In  the  remainder  of  this  section,  we  explore  the  aspects  of  Dedalus 
that  allow  such  ambiguities  and  propose  a  restricted  language  Dedalus+ 
that  rules  them  out  (but  complicates  the  specification  of  programs 
like our examples above). In Section 4, we consider a different
language, Dedalus5, that allows relatively intuitive program
specifications like our examples, but narrows their interpretation to a
single, “preferred” model.

3.1  Problems  with  Dedalus 

Definition 1. A Dedalus program is confluent if, for every EDB
instance,  it  has  a  single  ultimate  model.  A  program  that  is  not 
confluent  is  diffluent. 

Confluence is a desirable, albeit conservative, correctness property
for a distributed program. A program that is confluent produces
deterministic output despite any non-deterministic behaviors that
might occur during its execution. For example, if we could show that
a data replication protocol was confluent, we could prove a version
of the commonly desired property that all replicas be “eventually
consistent” after all messages have been delivered [40, 39].
Confluence may be viewed as a specialization of the more general notion
of consistency of distributed state, which the CALM theorem [24]
argues is strongly connected with the model-theoretic property of
logical monotonicity.

Unfortunately,  confluence  is  an  undecidable  property  of  Dedalus 
programs: 

Lemma  1.  Confluence  of  Dedalus  programs  is  undecidable. 

This  result  is  perhaps  not  surprising,  as  confluence  is  defined  over 
all  EDB  instances.  We  present  a  proof  in  the  appendix. 

Another  symptom  of  Dedalus  being  “too  big”  a  language  is  its 
expressive  power:  it  subsumes  PSPACE. 

Lemma  2.  Dedalus  subsumes  PSPACE. 

Proof.  We  show  how  to  write  the  PSPACE-complete  Quantified 
Boolean  Formula  (QBF)  problem  [20]  in  Dedalus.  Since  Dedalus 
is  closed  under  first-order  reductions  and  QBF  is  PSPACE-complete 
under first-order reductions, we have that PSPACE ⊆ Dedalus.
Details  are  in  the  appendix.  □ 

3.2  Dedalus+ 

Distributed  programs  that  produce  non-deterministic  outputs  or 
have  runtimes  exponential  in  their  inputs  are  often  undesirable  in 
practice.  Since  checking  for  confluence  in  Dedalus  is  undecidable 
in  general,  we  may  instead  ask  whether  a  more  constrained  language 
will  exclude  such  undesirable  programs.  We  will  present  a  restriction 
of  Dedalus  that  allows  only  confluent  programs  and  prove  that  it 
captures  exactly  PTIME. 

Definition 2. A Dedalus program is semipositive if the ¬ symbol
only  appears  on  EDB  relations  in  the  program. 

Definition  3.  A  Dedalus  program  P  has  guarded  asynchrony  if 
for  every  relation  p  appearing  in  the  head  of  an  asynchronous  rule, 
the program P has a rule p(X)@next ← p(X).

We will refer to the language of semipositive Dedalus programs
with guarded asynchrony as Dedalus+.


3.2.1  Confluence 

To show that all Dedalus+ programs are confluent, we begin by
showing that Dedalus+ programs are temporally inflationary: if
a stable model of a Dedalus+ instance contains a spatio-temporal
fact f@t, then it also contains f@t+1 (and thus the ultimate model
contains f).

Lemma  3.  Dedalus+  programs  are  temporally  inflationary. 

Proof. Consider a derivation tree for f@t: a finite tree of
instantiated (variable-free) rules, where negation only occurs at the leaves.
Note that the instantiated head atom, as well as every instantiated
body relation, is a spatio-temporal fact. The tree's root is some
instantiated rule with f@t in its head. A node has one child node for
each body fact: the child node contains an instantiated rule with the
fact in its head. If the body fact’s relation does not appear in the
head of any rule, then the corresponding node contains just the fact,
and is a leaf node. The leaves of the tree are instantiated EDB facts.

For  the  moment,  we  assume  that  every  fact  has  a  unique  derivation 
tree.  Multiple  derivation  trees  are  easy  to  handle — simply  repeat  the 
following  process  for  each  tree. 

If the relation of f is EDB, or appears in the head of an
asynchronous rule, then the lemma holds by definition of Dedalus+.
Assume some stable model contains f@t and not f@t+1. Thus, if
the rule is inductive (resp. deductive), then for some child of f@t,
call it g@t-1 (resp. g@t), the fact g@t (resp. g@t+1) is not in the
stable model. Inductively proceed down the tree, at each step going
to a node whose relation does not appear in the head of an
asynchronous rule. However, the path will eventually terminate at a leaf
node providing a contradiction, because facts at leaf nodes are either
EDB or negated EDB, meaning that they exist at all timestamps,
or they are one of time, timeSucc, or <, which also exist at all
timestamps. □

A  consequence  of  temporal  inflation  is  that  all  Dedalus+  programs 
are  confluent. 

Theorem  1.  Dedalus+  programs  are  confluent. 

Proof. Towards a proof by contradiction, consider a Dedalus+
program that induces two ultimate models U1, U2 for some EDB.
Without loss of generality, there must be a spatial fact f, such that
f ∈ U1 and f ∉ U2.

Recall that if spatial fact f is in some ultimate model, then for
some t0 ∈ ℕ, there is some stable model that contains f@t for all
t > t0.

Consider a derivation tree for f@t0 in any stable model that yields
U1. Again, for simplicity, we assume uniqueness of this derivation
tree. For some child of f@t0, call it g@s, for all stable models
that yield U2 there is no r such that g@r is in the stable model by
Lemma 3. Continue traversing the tree, at each step picking such a
g. Eventually, the traversal terminates at an EDB node, leading to a
contradiction. □
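The role of monotonicity in Theorem 1 can be explored with a small experiment. The sketch below is an illustration under stated assumptions, not part of the proof: the `NODES` set, the three edge messages, the `unreached` relation, and the helper names are hypothetical. It delivers the same messages in every possible order to a node whose received and derived facts persist, once with positive rules only and once with an added rule that negates a derived relation.

```python
from itertools import permutations

NODES = {1, 2, 3, 4}                     # hypothetical address space


def closure(edges):
    """Fixpoint of path(X,Y) <- edge(X,Y); path(X,Z) <- path(X,Y), edge(Y,Z)."""
    path = set(edges)
    while True:
        new = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2} - path
        if not new:
            return path
        path |= new


def run(order, negate):
    """Deliver edge messages one at a time; every fact persists once derived."""
    received, facts = set(), set()
    for msg in order:
        received.add(msg)
        paths = closure(received)
        facts |= {("path", x, y) for (x, y) in paths}
        if negate:
            # Negation over a derived (IDB) relation, applied as soon as possible.
            facts |= {("unreached", n) for n in NODES
                      if n != 1 and (1, n) not in paths}
    return frozenset(facts)


messages = [(1, 2), (2, 3), (3, 4)]
positive = {run(order, negate=False) for order in permutations(messages)}
negated = {run(order, negate=True) for order in permutations(messages)}

print(len(positive), len(negated))
```

In this toy setting the positive program reaches one outcome regardless of delivery order, while the variant with negation over a derived relation reaches several, mirroring the contrast between Dedalus+ and unrestricted Dedalus.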

Since  a  Dedalus+  program  has  a  unique  ultimate  model,  the 
specific  choice  of  values  for  timestamps  does  not  affect  the  ultimate 
model.  In  particular,  we  can  assume  that  the  timeSucc  of  the  body 
timestamp  is  always  chosen: 

Corollary 1. Define the program transformation A(P) to be
the transformation that converts every asynchronous rule φ of
Dedalus+ program P into an inductive rule by undoing the causality
and choice rewrites, dropping the choice operator, and adding
timeSucc(T, S) to pos(φ). Then, the ultimate model of A(P) is
the same as the ultimate model of P.


Of course, there are confluent Dedalus programs not in Dedalus+.
For example:

Example  9.  A  confluent  Dedalus  program  that  is  not  a  Dedalus+ 
program. 

b(#N, I)@async ← b_edb(#L, I).
b(I)@next ← b(I), ¬dequeued(I).
b_lt(I, J) ← b(I), b(J), I < J.
dequeued(I)@next ← b(I), ¬b_lt(_, I),
    b_lt(_,

Any  instance  of  this  program  has  a  single  ultimate  model  in 
which  b()  (at  all  nodes)  contains  the  highest  element  in  b_edb() 
according  to  the  order  <.  Thus  it  is  confluent,  but  the  program  uses 
IDB  negation  and  does  not  have  guarded  asynchrony. 

3.2.2  Computational  Complexity 

Not  only  are  Dedalus+  programs  confluent,  but  they  also  capture 
exactly  PTIME.  We  will  prove  this  by  showing  an  equivalence  to 
semipositive  Datalog  programs,  which  are  known  to  capture  exactly 
PTIME  over  ordered  structures  [26]. 

First,  we  note  that  inductive  rules  in  Dedalus+  can  be  “converted” 
into  deductive  rules  without  affecting  the  ultimate  model. 

Lemma 4. Define the program transformation I(P) in the following
way: in every inductive rule of Dedalus+ program P (except
any basic persistence rule for a relation that appears in the head
of an asynchronous rule), remove the timeSucc(T, S) body atom,
and replace all instances of the variable S with the variable T. The
ultimate model of I(P) is the same as the ultimate model of P.

Proof. Note that by Lemma 3, I(P) is inflationary. The proof
proceeds similarly to the proof of Lemma 3: there is some fact in
U1 but not U2; we consider a derivation tree for this fact in any
stable model; it must be the case that some child fact of the parent
does not appear in any stable model for U2 (by Lemma 3). We
inductively repeat the procedure, and discover that in order for the
fact to be absent from U1, the EDB must be different, which is a
contradiction. □

Theorem  2.  Dedalus+  captures  exactly  PTIME. 

Proof.  First  we  apply  Corollary  1  to  rewrite  asynchronous  rules 
as  inductive  rules.  Then,  we  convert  all  inductive  rules  into  deductive 
rules  using  Lemma  4.  Since  all  rules  are  deductive,  there  is  a  unique 
stable  model,  which  is  also  the  same  for  every  timestamp. 

Consider  removing  the  timestamp  attributes  from  all  relations, 
and  thus  the  time  relations  from  the  bodies  of  all  rules.  The  result  is 
a  Datalog  program  with  EDB  negation.  Its  minimal  model  is  exactly 
the  ultimate  model  of  the  single-timestep  Dedalus+  program. 

In  the  other  direction,  it  is  clear  that  we  can  encode  any  Datalog 
program  with  EDB  negation  in  Dedalus+  using  deductive  rules;  the 
ultimate  model  coincides  with  the  minimal  model  of  the  Datalog 
program.  □ 
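The effect of converting inductive rules into deductive ones (Lemma 4) can be checked on a toy semipositive program. The sketch below rests on assumptions that are not in the paper: the relations `edge` and `blocked` are hypothetical EDB relations, and the two rules in the docstring are an invented example. It runs the inductive program timestep by timestep and compares its eventual contents with the fixpoint of the timestamp-free Datalog program.

```python
def step(path, edge, blocked):
    """One application of the rules (negation only on the EDB relation blocked):
         path(X,Y) <- edge(X,Y), not blocked(Y).
         path(X,Z) <- path(X,Y), edge(Y,Z), not blocked(Z).
    """
    base = {(x, y) for (x, y) in edge if y not in blocked}
    rec = {(x, z) for (x, y) in path for (y2, z) in edge
           if y == y2 and z not in blocked}
    return base | rec


def inductive_trace(edge, blocked, steps):
    """Inductive reading: path at time t+1 is derived only from facts at time t."""
    trace = [set()]
    for _ in range(steps):
        trace.append(step(trace[-1], edge, blocked))
    return trace


def deductive_fixpoint(edge, blocked):
    """Timestamp-free reading: iterate the same rules to a fixpoint."""
    path = set()
    while True:
        nxt = step(path, edge, blocked)
        if nxt == path:
            return path
        path = nxt


edge = {(1, 2), (2, 3), (3, 4), (3, 5)}
blocked = {4}
trace = inductive_trace(edge, blocked, steps=6)
model = deductive_fixpoint(edge, blocked)
print(sorted(model))
```

Each timestep of the trace contains the previous one (temporal inflation), and the trace settles on the same set of facts as the deductive fixpoint. The fixpoint loop terminates after at most a quadratic number of rounds in the number of constants, since each round before the last adds at least one pair.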

4.  Dedalus5 

Returning to our running examples, it is easy to see that neither
program is directly expressible in Dedalus+. The marriage program
from Example 7 uses IDB negation to determine the truth value of
runaway. To avoid using IDB negation, we can rewrite the program
to “push down” negation to the EDB relations groom_i_do and
bride_i_do, and then derive the runaway IDB relation positively
as shown in Example 10. While the rewrite is straightforward, a
majority of the program’s rules need to be modified. Note that since
Example 10 is written in Dedalus+, the program must be confluent;
therefore, it is not subject to the non-deterministic output observed
for the original marriage program (Example 7).

Example  10.  An  asynchronous  marriage  ceremony  without  IDB 
negation: 

groom_i_dont()@async ← ¬groom_i_do_edb().
bride_i_dont()@async ← ¬bride_i_do_edb().
runaway() ← groom_i_dont().
runaway() ← bride_i_dont().
runaway()@next ← runaway().
groom_i_dont()@next ← groom_i_dont().
bride_i_dont()@next ← bride_i_dont().

The garbage collection program from Example 8 is likewise
outside Dedalus+ due to IDB negation, but it presents a rather more
difficult problem, as negation must be pushed down through
recursion. The rules for positively computing the negation of a transitive
closure are not particularly intuitive, and expressing the negation
of an arbitrary recursive computation is even more difficult [25].
Furthermore, the best known strategies involve at least a doubling
in the arity of the relations.

In  general,  the  restriction  of  negation  to  EDB  relations  presents  a 
significant  barrier  to  expressing  practical  programs.  In  this  section, 
we  introduce  Dedalus5,  an  extension  of  Dedalus+  that  allows  a 
limited  form  of  IDB  negation  but  retains  the  benefits  of  Dedalus+: 
Dedalus5  also  captures  PTIME  exactly  and  allows  only  confluent 
programs.  We  show  that  Dedalus5  and  Dedalus+  are  equivalently 
expressive. Then we provide an operational semantics for Dedalus5,
based  on  the  one  for  Dedalus  [3],  inspired  by  coordination  protocols 
from  distributed  systems. 

4.1  Safe  IDB  Negation 

The  stratified  semantics  for  logic  programs  with  negation  is  both 
intuitive  and  corresponds  to  common  distributed  systems  practices: 
negation  is  not  applied  until  the  negated  relation  is  “done”  being 
computed. 

First, we define a predicate dependency graph (PDG). The PDG
of a Dedalus program P with spatio-temporal schema S* is a
directed graph with one node per relation; each node i has a label
L(i). If node i represents relation p, then L(i) = p. There is an edge
from the node with label q to the node with label p if relation p
appears in the head of a rule with q in its body. If some rule with
p in the head and q in the body is asynchronous (resp. inductive),
then the edge is said to be asynchronous (resp. inductive). If some
rule with p in the head has ¬q in its body, then the edge is said
to be negated. Collectively, asynchronous and inductive edges are
referred to as temporal edges. The PDG does not contain nodes for
the node, timeSucc, or time relations, or any relation introduced
in the causality [3] or choice [38] rewrites.

Dedalus5 is the language of Dedalus programs with guarded
asynchrony whose PDG does not contain any cycles through
negation. As is standard, a Dedalus5 program can be partitioned into
strata. The stratum of a relation r is the largest number of negated
edges on any path from r.
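The stratification check can be sketched as a small graph computation. The code below is a sketch under explicit assumptions: the function name `strata` and the example PDG (loosely modeled on the garbage collection program) are hypothetical, edges are given as (body, head, negated) triples, and we adopt the usual convention that a relation's stratum counts the negated edges on paths leading into it, so that lower strata are evaluated first.

```python
def strata(relations, edges):
    """Assign strata given PDG edges (body, head, negated).

    Assumed convention: stratum(head) >= stratum(body), and strictly greater
    when the edge is negated. Raises ValueError on a cycle through negation.
    """
    stratum = {r: 0 for r in relations}
    limit = len(stratum)                  # a legal stratum is at most limit - 1
    while True:
        changed = False
        for body, head, negated in edges:
            need = stratum[body] + (1 if negated else 0)
            if need > stratum[head]:
                if need >= limit:
                    raise ValueError("PDG has a cycle through negation")
                stratum[head] = need
                changed = True
        if not changed:
            return stratum


# Hypothetical PDG, loosely modeled on the garbage collection example.
relations = ["local_ptr_edb", "refers_to", "reach", "root_edb", "addr", "garbage"]
edges = [
    ("local_ptr_edb", "refers_to", False),
    ("refers_to", "reach", False),
    ("reach", "reach", False),            # positive recursion is allowed
    ("reach", "garbage", True),           # garbage depends on not reach
    ("root_edb", "garbage", False),
    ("addr", "garbage", False),
    ("garbage", "garbage", False),        # persistence rule
]
result = strata(relations, edges)
print(result)
```

A program such as p ← ¬q, q ← ¬p makes the counters grow without bound, which the sketch reports as a cycle through negation.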

Each stratum of an n-stratum Dedalus5 program can be viewed
as a Dedalus+ program. Stratum i’s program, P_i, consists of all
rules whose head relation is in stratum i. The output schema of P_i
contains all relations in stratum i + 1, and P_i’s EDB contains all
relations in stratum j < i. P_0’s EDB contains all EDB relations.
P_n’s output schema contains all relations in P's output schema.

The ultimate model of a Dedalus5 program is the ultimate model
P_n(...P_1(P_0(E))...).


Since  a  Dedalus5  program  is  a  straightforward  composition  of 
Dedalus+  programs,  we  can  apply  several  previous  results.  Note 
that  Dedalus5  programs  are  temporally  inflationary. 

Corollary  2.  Dedalus5  programs  are  confluent. 

Note  that  every  Dedalus+  program  is  a  Dedalus5  program,  and 
every  Dedalus5  program  has  a  constant  number  of  strata  in  the  size 
of  its  input.  Thus  we  have: 

Corollary  3.  Dedalus5  programs  capture  exactly  PTIME. 

Thus,  Dedalus5  maintains  the  desirable  properties  of  Dedalus+: 
it  is  both  confluent  and  PTIME. 

4.2  Coordination  rewrite 

While  the  model-theoretic  semantics  of  Dedalus5  are  clear,  its 
negation  semantics  are  different  than  those  of  Dedalus.  Thus,  we 
cannot  directly  apply  the  correspondence  to  a  distributed  operational 
semantics  in  Alvaro  et  al.  [3].  Fortunately,  we  can  rewrite  any 
Dedalus5  program  to  a  Dedalus  program. 

Given a Dedalus5 program S, the coordination rewrite P(S) of S
is the Dedalus program obtained by adding p_done() to the body
of any rule in S that contains a ¬p(...) atom and adding rules to
define p_done() as described below.

We will see that p_done() has the property that in any stable
model M, if p_done(l,t) ∈ M, then p_done(l,s) ∈ M for
all timestamps s > t. Furthermore, if p_done(l,t) ∈ M, then
p(l,s,c_1,...,c_n) ∈ M implies that p(l,t,c_1,...,c_n) ∈ M
for all timestamps s > t. Intuitively, p_done() is true when the
content of p is sealed (henceforth unchanging).

We  will  present  a  specification  of  p_done()  after  introducing 
some  preliminary  definitions. 

A  collapsed  PDG  of  a  Dedalus  program  P  is  the  graph  obtained 
by  replacing  each  strongly  connected  component  of  the  PDG  of  P 
with  a  single  node  i,  such  that  L(i)  comprises  the  set  of  all  relations 
from  the  component.  If  a  strongly  connected  component  has  any 
asynchronous  edges,  we  call  the  resulting  collapsed  node  async 
recursive. Each node in the collapsed PDG whose label contains a
relation name in S° is called an output node. Note that a collapsed
PDG is acyclic.
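Collapsing the PDG amounts to computing strongly connected components. The sketch below uses Tarjan's algorithm as one possible way to do this; the function name `collapse`, the (body, head, is_async) edge encoding, and the three-relation example are assumptions made for illustration.

```python
def collapse(relations, edges):
    """Return [(component, async_recursive)] for PDG edges (body, head, is_async)."""
    succ = {r: [] for r in relations}
    for body, head, _ in edges:
        succ[body].append(head)

    index, low, stack, on_stack, comps = {}, {}, [], set(), []

    def visit(v):
        # Tarjan's algorithm: low[v] is the smallest index reachable from v
        # through nodes still on the stack.
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succ[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:            # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(frozenset(comp))

    for r in relations:
        if r not in index:
            visit(r)

    # A collapsed node is async recursive if some asynchronous edge lies
    # entirely inside its component.
    return [(comp, any(is_async and body in comp and head in comp
                       for body, head, is_async in edges))
            for comp in comps]


relations = ["a", "b", "c"]
edges = [("a", "b", True), ("b", "a", False), ("b", "c", True)]
collapsed = dict(collapse(relations, edges))
print(collapsed)
```

Here a and b form one async recursive node, while c sits in its own node: the asynchronous edge from b to c crosses components and therefore does not make c's node async recursive.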

For EDB relations p, the rule for p_done is p_done().⁵ For
IDB relations, defining p_done() takes some work. Intuitively,
p_done() for p ∈ L(i) directly depends on r_done() for any r in
the body of a rule with p in the head. Additionally, asynchronous
rules take some care: while deductively defined relations are done
in the same timestamp as all relations they depend on, there may be
arbitrary delay before asynchronously defined relations are done.

For ease of exposition, we will first present the computation of
p_done() for p in non-async-recursive nodes. We will then explain
how  to  support  async  recursive  nodes.  We  assume  that  all  inductive 
rules  have  been  rewritten  to  deductive  rules  (Lemma  4). 

4.2.1 Non-Async-Recursive Nodes

For  non-async-recursive  nodes,  we  can  compute  a  done  fact  for 
each  rule,  then  collate  these  into  done  facts  for  each  relation.  We 
handle  deductive  and  asynchronous  rules  separately.  The  done  fact 
for  a  deductive  rule  is  true  when  all  of  the  relations  in  the  body  of  the 
rule  are  henceforth  unchanging.  The  done  fact  for  an  asynchronous 
rule  is  henceforth  true  at  some  local  timestamp  after  all  facts  derived 
in  the  head  relation  are  true  at  their  respective  locations.  We  assume 
guarded  asynchrony  applies  to  the  rules  in  this  section. 

⁵ This expression is actually a rule. Consider the unsugared form:
p_done(L,T) ← node(L), time(T).


Let i be a non-async-recursive node. Repeat the following for
each element p ∈ L(i). Assume the rules in P with head relation
p are numbered 1, ..., i_p. The rule for p_done() is:

p_done() ← r_1_done(), ..., r_{i_p}_done().

Let the nodes in the collapsed PDG connected via incoming edges
to node i be denoted by E(i). Let the relations in L(k), for
k ∈ E(i), be named p_1, ..., p_q.

For each rule 1 ≤ j ≤ i_p in P with head relation p, if j is:
Deductive: Add the rule:

r_j_done() ← p_1_done(), ..., p_q_done().

Asynchronous: For each asynchronous rule:

p(#N,W)@async ← b_1(#L,X_1), ..., b_l(#L,X_l),
    ¬c_1(#L,Y_1), ..., ¬c_m(#L,Y_m).

add the following set of rules:

p_j_to_send(N,W) ← b_1(#L,X_1), ..., b_l(#L,X_l),
    ¬c_1(#L,Y_1), ..., ¬c_m(#L,Y_m).
p_j_to_send_done() ← b_1_done(), ..., b_l_done(),
    c_1_done(), ..., c_m_done().
p_j_send(#N,L,X)@async ← p_j_to_send(#L,N,X).
p_j_ack(#N,L,X)@async ← p_j_send(#L,N,X).
r_j_done_node(#L,N)@async ← p_1_done(#N), ...,
    p_q_done(#N), (∀X.p_j_to_send(#N,L,X) ⇒
    p_j_ack(#N,L,X)).
r_j_done() ← (∀N.node(N) ⇒ r_j_done_node(N)).

The first rule stores messages to be sent at the body (source’s)
location specifier, so the source can check whether all messages have
been acknowledged. The original destination location specifier is
stored as an ordinary column in the p_j_to_send relation (indicated
by the absence of #). Note that because this first rule is a deductive
rule, as well as the only rule defining p_j_to_send, the p_j_to_send
relation is done at the same time as the body relations of the first rule,
as shown in the second rule. The third rule copies messages to the
correct destination location specifier, while including the location
specifier of the source (L). The fourth derives acknowledgments at
the source’s location specifier. The fifth rule (at the source) derives
an r_j_done_node fact at a node when the source has a p_j_ack for
each p_j_send. Note that the causality constraint ensures that the
timestamp chosen for each r_j_done_node message is greater than
any timestamp before the stable model satisfies the body of the rule.
The final rule (at the destination) asserts that rule j is done once
r_j_done_node has been received from all nodes; intuitively, the
rule is done when all messages from all nodes have been received.
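The send, acknowledge, and done-notification pattern can be simulated with a randomly ordered message pool. This is a rough Python sketch rather than a Dedalus execution: the function `simulate`, the node names, and the payloads are hypothetical, and the body relations of the asynchronous rule are assumed to be EDB (hence done from the start). It records what each node had received at the moment it heard a done notification from every node.

```python
import random


def simulate(to_send, seed=0):
    """to_send maps each node to a list of (destination, payload) pairs.

    Messages are delivered in a random order to mimic arbitrary channel delay.
    Returns (received, snapshot): the final contents at each node, and the
    contents each node held when it concluded that the rule was done.
    """
    rng = random.Random(seed)
    nodes = set(to_send)
    in_flight = [("send", src, dst, x)
                 for src, msgs in to_send.items() for dst, x in msgs]
    received = {n: set() for n in nodes}
    acked = {n: set() for n in nodes}
    done_from = {n: set() for n in nodes}
    announced = set()
    snapshot = {}
    while True:
        # A source announces done_node once every message it sent is acked.
        for n in sorted(nodes):
            if n not in announced and acked[n] == set(to_send[n]):
                announced.add(n)
                in_flight.extend(("done_node", n, m, None) for m in sorted(nodes))
        if not in_flight:
            return received, snapshot
        kind, src, dst, x = in_flight.pop(rng.randrange(len(in_flight)))
        if kind == "send":
            received[dst].add((src, x))
            in_flight.append(("ack", dst, src, x))   # ack goes back to the source
        elif kind == "ack":
            acked[dst].add((src, x))                 # src here is the acknowledger
        else:
            done_from[dst].add(src)
            if done_from[dst] == nodes:              # heard from every node
                snapshot[dst] = set(received[dst])


to_send = {"n1": [("m", "a"), ("m", "b")], "n2": [("m", "c")], "m": []}
received, snapshot = simulate(to_send, seed=1)
print(sorted(snapshot["m"]))
```

Whatever the delivery order, the snapshot taken when a node concludes "done" equals its final contents in this model, because a source only announces completion after all of its messages have been acknowledged, and therefore delivered.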

The formula ∀X.φ(W,X), where φ(W,X) is of the form p(W,X) ⇒
q(W,X), translates to forall_φ(W), and the following rules are added:

p_φ_min(W,X) ← p(W,X), ¬p_φ_succ(W,_,X),
    p_φ_succ_done().
p_φ_max(W,X) ← p(W,X), ¬p_φ_succ(W,X,_),
    p_φ_succ_done().
p_φ_succ(W,X,Y) ← p(W,X), p(W,Y), X < Y,
    ¬p_φ_not_succ(W,X,Y), p_φ_not_succ_done().
p_φ_not_succ(W,X,Y) ← p(W,X), p(W,Y), p(W,Z),
    X < Z, Z < Y.
forall_φ_ind(W,X) ← p_φ_min(W,X), q(W,X).
forall_φ_ind(W,X) ← forall_φ_ind(W,Y),
    p_φ_succ(W,Y,X), q(W,X).
forall_φ(W) ← forall_φ_ind(W,X), p_φ_max(W,X).

The first four rules above compute a total order over the facts
in p_φ. The final three rules iterate over the total order of p_φ,
checking each p_φ to see if q also holds. If q does not hold for any p,
iteration will cease. However, if q holds for all p then forall_φ is
true.

We additionally need to add a rule for the vacuous case of the
universal quantification. In general, we cannot write forall_φ(W)
← ¬p(W,_), p_done()., because the variables in W do not obey
our safety restrictions. Thus, for every rule r that contains ∀X.φ(W,X)
in its body, we must duplicate r, replacing the ∀ clause with the
atom ¬p(W,_).

Note also that we are abusing notation for the < relation. We
previously defined < as a binary relation, but it is easy to define a
2n-ary version of < that encodes a lexicographic ordering over n-ary
relations. Here, we use < to refer to the latter.
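The translation of universal quantification into iteration over a total order can be mirrored with Python set comprehensions. The sketch below is an approximation under simplifying assumptions: the function name `forall_implies` is hypothetical, the context variables W are dropped, p and q are plain sets of comparable values, and the done guards are omitted because the sets are fully known.

```python
def forall_implies(p, q):
    """Decide whether every element of p is in q by walking p in order.

    Mirrors the min / max / succ / not_succ / forall_ind rules for a single
    fixed binding of the context variables.
    """
    not_succ = {(x, y) for x in p for y in p for z in p if x < z < y}
    succ = {(x, y) for x in p for y in p if x < y and (x, y) not in not_succ}
    p_min = {x for x in p if all(y != x for (_, y) in succ)}   # no predecessor
    p_max = {x for x in p if all(y != x for (y, _) in succ)}   # no successor

    ind = {x for x in p_min if x in q}         # base case of the iteration
    while True:
        new = {x for (y, x) in succ if y in ind and x in q} - ind
        if not new:
            break
        ind |= new
    return bool(ind & p_max)                   # iteration reached the maximum


print(forall_implies({1, 2, 3}, {1, 2, 3, 4}))
print(forall_implies({1, 2, 3}, {1, 3}))
print(forall_implies(set(), {1}))
```

The last call returns False even though the quantification is vacuously true: with an empty p there is no minimum element to start from, which is why the vacuous case needs the separate duplicated rule described above.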


4.2.2 Async Recursive Nodes

The difficulty with a relation p in an async recursive node is that
p is done when all messages have been received in the node, and
all messages have been received if p is done. To circumvent this
circular dependency, we introduce a specialized two-phase voting
protocol.

Consider  an  async  recursive  node  i. 

Let the asynchronous rules with head relations in L(i) be
numbered 1, ..., i_p. Add the rule:

all_ack_i() ← r_1_done(), ..., r_{i_p}_done().

For each rule j, add the rules for asynchronous rules in the
previous section, except for the last two rules. Instead write:

r_j_not_done() ← p_j_to_send(X), ¬p_j_ack(X).
r_j_done() ← ¬r_j_not_done().

We perform a two-round voting protocol among the nodes; the
node with the minimum identifier is the master. We assume that
guarded asynchrony does not apply to the relations that appear
in the head of any asynchronous rule in the following protocol.
The rules shown below begin the first round of voting. Nodes
vote complete_1_i if all_ack_i is true; intuitively, if they have no
outstanding unacknowledged messages. Votes are sent to the node
with minimum identifier (the master).

not_node_min(L1) ← node(L1), node(L2), L2 < L1.
node_min(L) ← ¬not_node_min(L), node(L).
start_round_1_i() ← node_min(#L,L), ¬round_1_i().
round_1_i()@next ← start_round_1_i().
round_1_i()@next ← round_1_i(), ¬start_round_2_i().
vote_1_i(#N)@async ← start_round_1_i(), node(N).
complete_1_i(#M,N)@async ← vote_1_i(#N),
    all_ack_i(#N), node_min(#N,M).
incomplete_1_i(#M,N)@async ← vote_1_i(#N),
    ¬all_ack_i(#N), node_min(#N,M).

To  persist  votes  until  round  1  begins  again,  the  following  rules 
are  instantiated  for  k  =  1  and  2. 

complete_k_i(N)@next ← complete_k_i(N),
    ¬start_round_1_i().
incomplete_k_i(N)@next ← incomplete_k_i(N),
    ¬start_round_1_i().

To count votes, we assume the following rules are instantiated for
k = 1 and 2. Round 1 is restarted if some node votes incomplete_1_i
in round 1 (i.e., it has an outstanding unacknowledged message)
or incomplete_2_i in round 2.


recv_k_i(N) ← complete_k_i(N).
recv_k_i(N) ← incomplete_k_i(N).
not_all_recv_k_i() ← node(N), ¬recv_k_i(N).
not_all_comp_k_i() ← node(N), ¬complete_k_i(N).
start_round_1_i() ← ¬not_all_recv_k_i(),
    not_all_comp_k_i().

Once a node has received a vote_1_i vote solicitation, it also
begins keeping track of whether it has sent any messages in the
async recursive component; this information is erased if another
vote_1_i solicitation is received. The causality constraint implies
that ¬all_ack_i() is true if a message is sent, because messages
cannot be instantly acknowledged.

sent_i() ← ¬all_ack_i().
sent_i()@next ← sent_i(), ¬vote_1_i().

Round  2  is  started  by  the  master  if  no  node  has  an  outstanding 
message. 

start_round_2_i() ← ¬not_all_recv_1_i(),
    ¬not_all_comp_1_i(), node_min(#L,L).

The voting for round 2 is shown below. Nodes vote incomplete_2_i
if they have sent any messages since the last vote_1_i solicitation.
Recall that any incomplete_2_i votes result in the protocol
restarting with round 1.

vote_2_i(#N)@async ← start_round_2_i(), node(N).
complete_2_i(#M,N)@async ← vote_2_i(#N),
    all_ack_i(#N), ¬sent_i(#N), node_min(#N,M).
incomplete_2_i(#M,N)@async ← vote_2_i(#N),
    sent_i(#N), node_min(#N,M).

The entire async recursive node i is done when all nodes have
voted complete_2_i.

done_recursion_i() ← ¬not_all_recv_2_i(),
    ¬not_all_comp_2_i().

For every relation p ∈ L(i), add the rule:
p_done() ← done_recursion_i().
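The two-round vote can be exercised in a small event simulation. The following Python sketch makes several simplifying assumptions that are not in the paper: the function `run` and its parameters are hypothetical, the master waits for all replies before starting the next round, application messages carry a hop counter and trigger one further message on receipt (standing in for recursive derivations), and delivery order is random.

```python
import random


def run(num_nodes=4, hops=3, seed=0):
    """Simulate the two-round termination vote; node 0 plays the master.

    Returns (restarts, leftover), where leftover lists application or ack
    messages still in flight when the master declares the recursion done.
    """
    rng = random.Random(seed)
    master = 0
    outstanding = [0] * num_nodes       # unacknowledged messages per node
    sent = [False] * num_nodes          # sent since the last round-1 solicitation
    in_flight = []                      # entries are (kind, src, dst, data)

    def send_app(src, hops_left):
        outstanding[src] += 1
        sent[src] = True
        in_flight.append(("app", src, rng.randrange(num_nodes), hops_left))

    def solicit(round_no):
        in_flight.extend(("vote", master, n, round_no) for n in range(num_nodes))

    for n in range(num_nodes):
        send_app(n, hops)
    votes, round_no, restarts = [], 1, 0
    solicit(round_no)
    while True:
        kind, src, dst, data = in_flight.pop(rng.randrange(len(in_flight)))
        if kind == "app":
            in_flight.append(("ack", dst, src, None))
            if data > 0:                # receipt triggers a further message
                send_app(dst, data - 1)
        elif kind == "ack":
            outstanding[dst] -= 1
        elif kind == "vote":
            idle = outstanding[dst] == 0
            if data == 1:
                sent[dst] = not idle    # erase the flag, then re-derive it
                ok = idle
            else:
                ok = idle and not sent[dst]
            in_flight.append(("reply", dst, master, ok))
        else:                           # a reply arriving at the master
            votes.append(data)
            if len(votes) == num_nodes:
                unanimous = all(votes)
                votes = []
                if unanimous and round_no == 2:
                    leftover = [m for m in in_flight if m[0] in ("app", "ack")]
                    return restarts, leftover
                if unanimous:
                    round_no = 2
                else:
                    round_no = 1
                    restarts += 1
                solicit(round_no)


restarts, leftover = run(seed=3)
print(restarts, leftover)
```

In this model the master never declares completion while an application message or acknowledgment is still in flight: a unanimous second round means every node was idle, and sent nothing, across the instant at which the first round finished.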

4.3 Equivalence of coordination rewrite to Dedalus5

We  first  argue  that  the  rules  for  computing  p_done  have  the 
desired  effect. 

Lemma 5 (Sealing). Assume a Dedalus5 program S with
relation p. The Dedalus program P(S) contains a relation p_done
with the following property: in any of its stable models M, if
p_done(l,t) ∈ M then p_done(l,s) ∈ M for all timestamps
s > t. Furthermore, if p_done(l,t) ∈ M then p(l,s,c_1,...,c_n) ∈
M implies that p(l,t,c_1,...,c_n) ∈ M for all timestamps s > t.

Proof. We assume that p_1_done(), ..., p_q_done() have
the properties mentioned in the Lemma.

Clearly,  p_done  ()  has  the  properties  mentioned  in  the  Lemma 
for  the  deductive  case. 

In the asynchronous case, p_done() is similarly correct; the
causality constraint implies that the timestamp on acknowledgments
is later than the timestamp on the facts they acknowledge, and thus
the timestamp on each node’s r_j_done_node fact is greater than
the timestamp on the acknowledged facts. Thus, before a node
concludes p_done(), that node has all p facts.

In the asynchronous recursive case, the causality constraint
ensures that every response in the second round is received at a time
greater than every response in the first round. Thus, between at
least the last response of the first round and the last response of
the second round, no node has outstanding messages and no node
sends a message. This implies that no node ever sends a message
again. □

The above Lemma implies that the ultimate model of Dedalus5
program S is the same as the ultimate model of Dedalus program
P(S), as relations in lower strata are complete before higher strata
rules are satisfiable.

4.4  Discussion 

Applying  the  program  transformation  P  to  the  garbage  collection 
program  from  Example  8  results  in  the  addition  of  the  following 
rules. 

Example 11. Synthesized rules for the garbage collection
program:

refers_to_to_send(M, Src, Dst) ←
    local_ptr_edb(N, Src, Dst), master(M).
refers_to_send(#M, L, Src, Dst)@async ←
    refers_to_to_send(#L, M, Src, Dst).
refers_to_ack(#N, L, Src, Dst) ←
    refers_to_send(#L, N, Src, Dst).
refers_to_done_node(#M, N)@async ←
    local_ptr_edb_done(#N), master(#N, M),
    (∀X.refers_to_to_send(#N, M, X) ⇒
    refers_to_ack(#N, M, X)).
refers_to_done(M) ← (∀N.node(N) ⇒
    refers_to_done_node(M, N)).
reach_done() ← refers_to_done(),
    (∀N.node(N) ⇒ local_ptr_edb_done(N)).

One  rule  from  the  original  program  must  also  be  rewritten  to 
include  the  new  subgoal  reach_done: 

Example 12. Garbage collection rewrite:

garbage(Addr) ← addr_edb(Addr), root_edb(Root),
    ¬reach(Root, Addr), reach_done().

As  we  have  shown,  the  resulting  program  has  a  single  ultimate 
model.  This  model  corresponds  exactly  with  one  of  the  ultimate 
models  of  the  original  program  from  Example  8:  the  model  in 
which ¬reach is not evaluated until reach is fully determined. The
rewrite  has  effectively  forced  an  evaluation  strategy  analogous  to 
stratum-order  evaluation  in  a  centralized  Datalog  program. 

Note also that the rewrite code is a generalization of the “coordination”
code that a Dedalus programmer could have written by hand to
ensure that the local relation refers_to is a faithful representation
of global state. In distributed systems, global computation barriers
are commonly enforced by protocols based on voting: the two-phase
commit protocol from distributed databases is a straightforward
example [23]. In the synthesized protocol shown above, every agent
responsible for a fragment of the global state must “vote” that
every message it has sent to the coordinator has been acknowledged.
The coordinator must tally these votes and ensure that the vote is
unanimous across all agents. If the protocol completes successfully, the
coordinator may proceed past the barrier.
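
The voting pattern described above can be sketched as a small Python simulation. This is a hedged illustration, not the synthesized Dedalus protocol itself: the function run_barrier, the message kinds, and the randomly ordered network are assumptions made for the example. Each node ships its facts, the coordinator acknowledges each one, a node votes once all of its messages are acknowledged, and the coordinator passes the barrier only on a unanimous tally.

```python
import random


def run_barrier(fragments, seed=0):
    """Simulate a vote-based barrier over a randomly ordered network.

    fragments maps each node to the list of local facts it ships to the
    coordinator. Returns (passed, snapshot): whether the coordinator passed
    the barrier, and the facts it held at that moment.
    """
    rng = random.Random(seed)
    network = [("send", n, f) for n, facts in fragments.items() for f in facts]
    acked = {n: set() for n in fragments}   # acknowledgements seen by each node
    voted = set()                            # nodes that have already cast a vote
    received, tally, snapshot = set(), set(), None

    def cast_votes():
        # A node votes once every message it sent has been acknowledged.
        for n, facts in fragments.items():
            if n not in voted and acked[n] == set(facts):
                voted.add(n)
                network.append(("vote", n, None))

    cast_votes()  # nodes with nothing to send may vote immediately
    while network:
        # Asynchrony: deliver an arbitrary in-flight message.
        kind, node, fact = network.pop(rng.randrange(len(network)))
        if kind == "send":            # coordinator stores the fact and acks it
            received.add((node, fact))
            network.append(("ack", node, fact))
        elif kind == "ack":           # sender records the acknowledgement
            acked[node].add(fact)
        else:                         # coordinator tallies a vote
            tally.add(node)
            if tally == set(fragments):   # unanimous: pass the barrier
                snapshot = set(received)
        cast_votes()
    return snapshot is not None, snapshot
```

Whatever delivery order the seed induces, the snapshot taken at the barrier contains every shipped fact, because a vote can only follow the acknowledgements that the coordinator itself issued.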

An  explicit  goal  of  our  work  with  Dedalus  has  been  to  view 
general  distributed  systems  through  a  model-theoretic  lens.  From 
this  perspective,  the  connection  between  coordination  protocols 
that  enforce  barriers  and  stratified  evaluation  of  logic  programs  is 
clear.  Indeed,  global  stratification  requires  a  coordination  protocol 
to  ensure  a  global  consensus  on  set  completion  before  negation  is 
applied. 


5.  RELATED  WORK 

Dedalus shares features with a long history of deductive database
systems. The purely declarative semantics of Dedalus, based on the
reification of logical time into facts, are closer in spirit and
interpretation to Statelog [29] and the languages proposed by Cleary and
Liu [15, 30, 33] than to languages that admit procedural semantics
to deal with update and deletion over time [12, 16]. Previous work
in temporal deductive databases attempted to compute finite
representations for periodic phenomena [13]; we reuse many of these
results in Dedalus.

Significant recent work ([4, 9, 14, 32]) has focused on applying
deductive database languages extended with networking primitives to
the problem of specifying and implementing network protocols and
distributed systems. Theorem 1 resembles the correctness proof of
“pipelined semi-naive evaluation” for distributed Datalog presented
by Loo et al. [31]. In general, however, the language extensions
proposed in much of this prior work added expressivity and domain
applicability but compromised the declarative semantics of Datalog,
making formal analysis difficult [35, 36]. In designing Dedalus,
we tried to recover and extend the model-theoretic analyses
applicable to pure Datalog, while preserving the features appropriate to
modeling loosely coupled distributed systems.

Specification  languages  such  as  TLA  [28]  and  I/O  Automata  [34] 
employ  first-order  logic  and  set  theory  to  model  and  prove  properties 
about distributed systems, and subsets of both languages produce
executable  code.  Like  Dedalus,  TLA  expresses  concurrent  systems 
in  terms  of  constraints  over  valuations  of  state,  and  temporal  logic 
that  describes  admissible  transitions.  Dedalus  differs  from  TLA  in 
its  minimalist  use  of  temporal  constructs  (@next  and  @async),  and 
in  its  model-theoretic  semantics.  I/O  Automata  model  distributed 
systems  at  a  lower  level  than  Dedalus,  as  a  composition  of  state 
machines  with  explicitly  specified  transition  systems.  We  intend 
to  further  explore  the  relationship  of  Dedalus  to  these  traditional 
distributed  systems  formalisms. 

Recently, Ameloot et al. explored Hellerstein’s CALM theorem
using relational transducers [8]. They proved that monotonic
first-order queries are exactly the set of queries that can be computed
in a coordination-free fashion in that transducer formalism. Their
work uses some different assumptions than ours; for example, they
assume that all messages sent by a node are multicast to a fixed set
of neighbors, whereas Dedalus permits arbitrary unicast. Relational
transducers have also been used to specify and show the correctness
of interactive web services and electronic commerce workflows
(e.g., [2, 17, 18]).

Abiteboul et al. recently proposed Webdamlog [1], another
distributed variant of Datalog that bears many similarities to Dedalus.
They demonstrate that Webdamlog has an operational semantics
similar to the operational semantics in Dedalus [3], and provide
conservative conditions for confluence based on a variant of (node-local)
stratification. Our work additionally provides a model-theoretic
semantics for Dedalus^S that corresponds to the operational semantics.
Dedalus^S programs (which are guaranteed to be confluent) also
admit a broader use of negation (ensured via a synthesized
coordination protocol) than the stratification conditions of Webdamlog.

6.  FUTURE  WORK 

An obvious topic for future work is to extend beyond our focus on
“one-shot” executions over fixed inputs. Some distributed
computations are continuous services whose semantics need to be described
with respect to subsets or subsequences of their inputs and outputs.
To this end, models from stream queries may be useful (e.g., [11]).
Our network model makes similar assumptions of finiteness: in
particular, we have ignored dropped messages (infinite delays) and
the standard practice of timeout logic for dealing with them. In
our applied work [4, 5] we have modeled timeouts as messages
that arrive asynchronously under the control of an external “clock”
agent. Programs that reason about timeouts typically “seal” the
contents of IDB relations based on the inherently non-deterministic
subset of messages that “beat the clock.” It would be interesting to
characterize a useful family of ultimate models in such programs
without resorting to the full power of Dedalus.

Concurrent with this research, our team has been developing
a practical language for implementing distributed systems called
Bloom [10]. Bloom has built-in support for input streams, including
“periodic” relations, in which tuples appear at regular (wall clock)
intervals and which are the basis of timeout logic. Instead of relying
on language restrictions like those presented in this paper, Bloom
offers the full power of Dedalus. However, we use the intuition
of Dedalus^S to motivate a (necessarily) conservative static
analysis for confluence of Bloom programs. The analysis can mark a
program as confluent if it uses only the constructs of Dedalus^S.
Otherwise, the analysis alerts the programmer to uses of negation
(and aggregation) that are applied over asynchronously delivered
messages or their consequences. The programmer must then
manually “guard” these negative constructs with coordination logic, and
manually verify the confluence of the result. This allows
programmers to choose from (and implement) a wide variety of coordination
protocols, as opposed to our approach here in which a compiler
synthesizes a simple, generic protocol. In practice, the performance
tradeoffs between these protocols can be substantial, depending on
the execution environment.

As future work, it would be helpful to formally characterize these
practical tradeoffs, and to automatically synthesize efficient and
provably confluent coordination logic suited to the environment. The
observation that two programs have the same complexity in a
Turing Machine model does not mean they have similar network
performance characteristics in the operational semantics of network
transducers. We are pursuing work on a complexity model that will
address this.
APPENDIX

A. PROOF OF LEMMA 1

Proof. Using the construction presented by Gaifman et al. [19], it
is  possible  to  write  a  Datalog  program  that  encodes  any  two-counter 
machine’s  transition  relation  and  an  arbitrarily  long  finite  successor 
relation  in  the  EDB,  and  define  a  0-ary  output  relation  accept  that  is 
true  if  and  only  if  the  two-counter  machine  accepts  and  the  transition 
and  successor  relations  are  valid.  As  the  construction  is  possible  in 
Datalog,  it  is  also  possible  in  Dedalus. 

We  add  the  following  rules  to  the  construction,  to  non-deterministically 
decide  whether  to  run  the  machine  or  not: 

message(0)@async.
message(1)@async.
run_machine() ← message(0), message(1).
accept() ← message(0), ¬message(1),
    input_valid().
accept() ← ¬message(0), message(1),
    input_valid().

Note  that  the  first  two  lines  are  actually  rules. 

For valid inputs, the ultimate model is accept() if and only if
either message(0) and message(1) are assigned the same
timestamp and the machine accepts, or if the timestamps are different.

For  invalid  inputs,  all  ultimate  models  are  empty. 

If  we  could  decide  confluence  for  this  program,  we  could  decide 
whether  there  is  any  valid  input  for  which  an  arbitrary  two-counter 
machine  halts  in  an  accepting  state.  □ 
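
The case analysis in this proof can be checked with a small Python sketch. It is a simplification under assumptions that are ours: each message is taken to arrive at exactly one timestamp, accept() is taken to persist once derived, and the names accept_derived and outcomes are hypothetical. The sketch enumerates delivery timestamps for a fixed input and collects the distinct outcomes; a single outcome corresponds to confluence on that input.

```python
from itertools import product


def accept_derived(t0, t1, machine_accepts, input_valid=True):
    """Is accept() derived when message(0) arrives at t0 and message(1) at t1?"""
    for t in {t0, t1}:
        m0, m1 = (t == t0), (t == t1)
        if input_valid and m0 != m1:
            # Exactly one message is present at t, so one of the two
            # rules with a negated message subgoal fires.
            return True
        if input_valid and m0 and m1 and machine_accepts:
            # Both messages share a timestamp: run_machine() fires and
            # accept() follows only if the machine accepts.
            return True
    return False


def outcomes(machine_accepts, horizon=3):
    """Distinct outcomes over all delivery timestamps in 1..horizon."""
    times = range(1, horizon + 1)
    return {
        accept_derived(t0, t1, machine_accepts)
        for t0, t1 in product(times, repeat=2)
    }
```

Under these assumptions, an accepting machine yields a single outcome, while a non-accepting machine yields two, depending on whether the two messages happen to share a timestamp.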

B.  QBF  IN  Dedalus 


We assume that the QBF formula is in prenex normal form:
$Q_1 x_1\, Q_2 x_2 \cdots Q_n x_n\, \varphi(x_1, \ldots, x_n)$. The textbook recursive algorithm
for QBF [20] involves removing $Q_1$ and recursively calling the
algorithm twice, once for $F_1 = Q_2 x_2 \cdots Q_n x_n\, \varphi(0, x_2, \ldots, x_n)$ and once
for $F_2 = Q_2 x_2 \cdots Q_n x_n\, \varphi(1, x_2, \ldots, x_n)$, substituting 0 and 1 for $x_1$. If $Q_1 = \exists$, then the
algorithm returns $F_1 \lor F_2$; if $Q_1 = \forall$, then $F_1 \land F_2$.
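
For reference, the textbook recursion just described can be written as a short Python function. The representation is an assumption made for illustration: quantifiers are given as a list of the strings "exists" and "forall", and the matrix is a Python function over a tuple of n bits.

```python
def eval_qbf(quantifiers, matrix, assignment=()):
    """Evaluate a prenex QBF by the textbook recursion.

    quantifiers[i] is "exists" or "forall" and binds variable i + 1;
    matrix maps a full tuple of 0/1 values to a truth value.
    """
    i = len(assignment)
    if i == len(quantifiers):
        # All variables are bound: evaluate the quantifier-free matrix.
        return bool(matrix(assignment))
    f1 = eval_qbf(quantifiers, matrix, assignment + (0,))  # next variable := 0
    f2 = eval_qbf(quantifiers, matrix, assignment + (1,))  # next variable := 1
    if quantifiers[i] == "exists":
        return f1 or f2
    return f1 and f2
```

For instance, calling eval_qbf(["forall", "exists"], lambda b: b[0] != b[1]) asks whether every x has some y different from it.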

The leaves of the tree of recursive calls can each be represented
as an $n$-bit binary number, where bit $i$ holds the value of $x_i$. Assume
the left child of a node at depth $i$ of the recursive call tree represents
binding $x_i$ to 0, and the right child to 1.

Our  algorithm  is  intuitively  similar  to  a  postorder  traversal  of  this 
recursive  call  tree.  Recursively,  first  visit  the  left  node,  then  visit 
the  right  node,  then  visit  the  root.  If  we  are  visiting  a  leaf  node,  we 
evaluate  the  formula  for  the  given  variable  binding  and  store  a  0  or 
1  at  the  node  depending  on  whether  the  formula  is  false  or  true  for 
that particular binding. If we are visiting a non-leaf node at level
$i$, we apply the quantifier $Q_i$ to the values stored in the child nodes.
Even though the recursive call tree is exponential in size, we only
require $O(n)$ space due to the sequentiality of the traversal.

First, we iterate through all of the $n$-bit binary numbers, one per
timestamp. We assume that the order over the variables is such that
the leftmost variable in the formula (the high-order bit) is $x_1$
(the first), and the rightmost is $x_n$ (the last). Thus, our addition is
“backwards” in that it propagates carries from $x_i$ to $x_{i-1}$:

carry(V) ← var_last(V).
one(V)@next ← carry(V), ¬one(V).
one(V)@next ← one(V), ¬carry(V).
carry(U) ← carry(V), one(V), var_succ(U, V).
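
A Python sketch of one timestep of these four rules may help in reading them. It rests on assumptions of ours: variables are numbered 1 to n with n playing the role of var_last, the set one holds the variables whose bit is currently 1, the counter starts with no one facts, and the helper names step, value, and run are hypothetical.

```python
def step(one, n):
    """Apply the counter rules once; return the set `one` at the next timestep."""
    carry = {n}                 # carry(V) <- var_last(V).
    v = n
    while v in one and v > 1:   # carry(U) <- carry(V), one(V), var_succ(U, V).
        v -= 1
        carry.add(v)
    # one(V)@next <- carry(V), not one(V).
    # one(V)@next <- one(V), not carry(V).
    return (carry - one) | (one - carry)


def value(one, n):
    """Read the bits as a binary number with variable 1 as the high-order bit."""
    return sum(1 << (n - v) for v in one)


def run(n, steps):
    """Counter values observed over the given number of timesteps."""
    one, seen = set(), []
    for _ in range(steps):
        seen.append(value(one, n))
        one = step(one, n)
    return seen
```

In this reading, each bit at the next timestep is the exclusive-or of its current value and its carry, so the counter advances by one per timestep and wraps around after the all-ones assignment.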

At  each  timestep,  we  check  whether  the  current  assignment  of 
values  to  the  variables  makes  the  formula  true.  We  omit  these  rules 
for  brevity.  If  the  formula  is  true,  then  formula_true()  is  true  at 
that  timestep. 

The following rules handle how nodes set their values to either 0
or 1. Note that we only require $2n$ bits of space for this step: each
depth $i$ in the recursive call tree has two one-bit registers
(labelled by constant symbols a and b) representing the current
values of the children in the traversal.

var_sat_in associates a depth with a given truth value (0 or
1). This value is placed into var_sat at depth V in register a if a
is empty, or b otherwise. Once a value is placed in register b, it
is deleted in the immediate next timestamp. As we will see later,
before this deletion, the parent node applies its quantifier to the
values in the two registers.

The truth value at depth $n$ (denoted by var_last) is the truth
value of the formula (formula_true()) for the assignment of
variables at the current timestep.

var_sat_in(V, 1) ← formula_true(), var_last(V).
var_sat(a, V, B)@next ← var_sat_in(V, B),
    ¬var_sat(_, V, _).
var_sat(b, V, B)@next ← var_sat_in(V, B),
    var_sat(a, V, _).
var_sat(N, V, B)@next ← var_sat(N, V, B),
    ¬var_sat(b, V, _).

var_sat_left_in  associates  a  value  with  the  parent  of  a  given 
depth.  This  is  used  for  propagating  the  result  of  the  quantifier 
application  to  the  parent.  The  cases  for  existential  (exists)  and 
universal  (forall)  quantifiers  are  clear. 


var_sat_in(N, U, B) ← var_sat_left_in(V, B),
    var_succ(U, V).
var_sat_left_in(vn, 1) ← exists(vn),
    var_sat(_, vn, 1).
var_sat_left_in(vn, 0) ← exists(vn),
    var_sat(a, vn, 0), var_sat(b, vn, 0).
var_sat_left_in(vn_succ, 1) ← forall(vn),
    var_sat(a, vn, 1), var_sat(b, vn, 1).
var_sat_left_in(vn_succ, 0) ← forall(vn),
    var_sat(_, vn, false).

Finally, the entire formula is satisfiable(1) (satisfiable) if the
output of the first quantifier is 1, and satisfiable(0) (unsatisfiable)
if the output of the first quantifier is 0.

satisfiable(B) ← var_sat_left_in(V, B),
    var_first(V).
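
The overall scheme (a counter over assignments plus two one-bit registers per depth) can be mimicked in Python as follows. This is a simplified sketch rather than a transcription of the rules above: the name qbf_registers is hypothetical, the quantifier at a depth is applied only once both of its registers are filled, and the counter is replaced by itertools.product, which enumerates assignments in the same order.

```python
from itertools import product


def qbf_registers(quantifiers, matrix):
    """Evaluate a prenex QBF with at most two stored bits per depth.

    quantifiers[d - 1] is "exists" or "forall" for depth d; matrix maps a
    tuple of n bits to a truth value.
    """
    n = len(quantifiers)
    regs = {d: [] for d in range(1, n + 1)}   # registers a and b per depth
    # Enumerate assignments in counter order (first variable is high-order).
    for bits in product((0, 1), repeat=n):
        value, depth = int(bool(matrix(bits))), n
        while depth >= 1:
            regs[depth].append(value)
            if len(regs[depth]) < 2:
                break                          # register a filled; wait for b
            a, b = regs[depth]
            regs[depth] = []                   # both registers are cleared
            if quantifiers[depth - 1] == "exists":
                value = a | b
            else:
                value = a & b
            depth -= 1                         # pass the result to the parent
        else:
            # The first quantifier has produced its output.
            return bool(value)
    return None  # not reached for well-formed input
```

The dictionary regs never holds more than 2n bits, which matches the space bound claimed for this step, although the enumeration itself still takes exponentially many timesteps.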


